AITopics | Quảng Ninh Province

Quảng Ninh Province

Multi-Dialect Vietnamese: Task, Dataset, Baseline Models and Challenges

Van Dinh, Nguyen, Dang, Thanh Chi, Nguyen, Luan Thanh, Van Nguyen, Kiet

arXiv.org Artificial Intelligence, Oct-4-2024

Vietnamese, a low-resource language, is typically categorized into three primary dialect groups that belong to Northern, Central, and Southern Vietnam. However, each province within these regions exhibits its own distinct pronunciation variations. Despite the existence of various speech recognition datasets, none of them has provided a fine-grained classification of the 63 dialects specific to individual provinces of Vietnam. To address this gap, we introduce the Vietnamese Multi-Dialect (ViMD) dataset, a novel comprehensive dataset capturing the rich diversity of 63 provincial dialects spoken across Vietnam. Our dataset comprises 102.56 hours of audio, consisting of approximately 19,000 utterances, and the associated transcripts contain over 1.2 million words. To provide benchmarks and simultaneously demonstrate the challenges of our dataset, we fine-tune state-of-the-art pre-trained models for two downstream tasks: (1) dialect identification and (2) speech recognition. The empirical results suggest two implications: the influence of geographical factors on dialects, and the constraints of current approaches in speech recognition tasks involving multi-dialect speech data. Our dataset is available for research purposes.

dataset, dialect, experiment, (17 more...)

arXiv.org Artificial Intelligence

2410.03458

Country:

Asia > Vietnam > Hanoi > Hanoi (0.14)
Asia > Vietnam > Thanh Hóa Province > Thanh Hóa (0.04)
Asia > Vietnam > Hưng Yên Province > Hưng Yên (0.04)
(65 more...)

Genre: Research Report > New Finding (0.66)

Industry: Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

CON: Continual Object Navigation via Data-Free Inter-Agent Knowledge Transfer in Unseen and Unfamiliar Places

Terashima, Kouki, Iwata, Daiki, Tanaka, Kanji

arXiv.org Artificial Intelligence, Sep-23-2024

This work explores the potential of brief inter-agent knowledge transfer (KT) to enhance robotic object goal navigation (ON) in unseen and unfamiliar environments. Drawing on the analogy of human travelers acquiring local knowledge, we propose a framework in which a traveler robot (student) communicates with local robots (teachers) to obtain ON knowledge through minimal interactions. We frame this process as a data-free continual learning (CL) challenge, aiming to transfer knowledge from a black-box model (teacher) to a new model (student). In contrast to approaches like zero-shot ON using large language models (LLMs), which utilize inherently communication-friendly natural language for knowledge representation, the other two major ON approaches -- frontier-driven methods using object feature maps and learning-based ON using neural state-action maps -- present complex challenges where data-free KT remains largely uncharted. To address this gap, we propose a lightweight, plug-and-play KT module targeting non-cooperative black-box teachers in open-world settings. Using the universal assumption that every teacher robot has vision and mobility capabilities, we define state-action history as the primary knowledge base. Our formulation leads to the development of a query-based occupancy map that dynamically represents target object locations, serving as an effective and communication-friendly knowledge representation. We validate the effectiveness of our method through experiments conducted in the Habitat environment.

prob 0, robot, student, (14 more...)

arXiv.org Artificial Intelligence

2409.14899

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Vietnam > Quảng Ninh Province > Hạ Long (0.04)
Asia > South Korea > Daegu > Daegu (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.82)

Industry:

Transportation (0.88)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese

Doan, Khang T., Huynh, Bao G., Hoang, Dung T., Pham, Thuc D., Pham, Nhat H., Nguyen, Quan T. M., Vo, Bang Q., Hoang, Suong N.

arXiv.org Artificial Intelligence, Aug-23-2024

In this report, we introduce Vintern-1B, a reliable 1-billion-parameter multimodal large language model (MLLM) for Vietnamese language tasks. By integrating the Qwen2-0.5B-Instruct language model with the InternViT-300M-448px visual model, Vintern-1B is optimized for a range of applications, including optical character recognition (OCR), document extraction, and general question answering in Vietnamese contexts. The model is fine-tuned on an extensive dataset of over 3 million image-question-answer pairs, achieving robust performance and reliable results across multiple Vietnamese language benchmarks such as OpenViVQA and ViTextVQA. Vintern-1B is small enough to fit easily into various on-device applications. Additionally, we have open-sourced several Vietnamese vision question answering (VQA) datasets for text and diagrams, created with Gemini 1.5 Flash. Our models are available at: https://huggingface.co/5CD-AI/Vintern-1B-v2.

dataset, vertex, vintern-1b, (12 more...)

arXiv.org Artificial Intelligence

2408.1248

Country:

Asia > Vietnam > Bạc Liêu Province > Bạc Liêu (0.14)
Asia > Vietnam > Khánh Hòa Province (0.05)
Asia > Vietnam > Quảng Ninh Province (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

A Hybrid-Layered System for Image-Guided Navigation and Robot Assisted Spine Surgery

T, Suhail Ansari, Maik, Vivek, Naheem, Minhas, Ram, Keerthi, Lakshmanan, Manojkumar, Sivaprakasam, Mohanasankar

arXiv.org Artificial Intelligence, Jun-7-2024

In response to the growing demand for precise and affordable solutions for Image-Guided Spine Surgery (IGSS), this paper presents the development of a Robot-Assisted and Navigation-Guided IGSS System. The endeavor involves integrating cutting-edge technologies to attain the required surgical precision and limit user radiation exposure, thereby addressing the limitations of manual surgical methods. We propose an IGSS workflow and system architecture employing a hybrid-layered approach, combining modular and integrated system architectures in distinctive layers to develop an affordable system for seamless integration, scalability, and reconfigurability. We developed and integrated the system and extensively tested it on phantoms and cadavers. On phantoms, the proposed system's accuracy is 1.020 mm using navigation guidance and 1.11 mm using robot assistance. Cadaveric validation showed similar performance: per the Gertzbein-Robbins scale, 84% of screw placements were grade A and 10% were grade B using navigation guidance, while 90% were grade A and 10% were grade B using robot assistance, supporting the system's efficacy for IGSS. The evaluated performance is adequate for IGSS and on par with existing systems in the literature and those commercially available. The user radiation is lower than in the literature, given that the system requires only an average of 3 C-Arm images per pedicle screw placement and verification.

accuracy, architecture, module, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/SII58957.2024.10417647

2406.04644

Country:

Asia > India > Tamil Nadu > Chennai (0.04)
Europe > Italy (0.04)
Asia > Vietnam > Quảng Ninh Province > Hạ Long (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Therapeutic Area > Orthopedics/Orthopedic Surgery (0.72)
Health & Medicine > Therapeutic Area > Musculoskeletal (0.72)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

AAPMT: AGI Assessment Through Prompt and Metric Transformer

Huang, Benhao

arXiv.org Artificial Intelligence, Mar-27-2024

The emergence of text-to-image models marks a significant milestone in the evolution of AI-generated images (AGIs), expanding their use in diverse domains such as design and entertainment. Despite these breakthroughs, the quality of AGIs often remains suboptimal, highlighting the need for effective evaluation methods. These methods are crucial for assessing the quality of images relative to their textual descriptions, and they must accurately mirror human perception. Substantial progress has been achieved in this domain, with innovative techniques such as BLIP and DBCNN contributing significantly. However, recent studies, including AGIQA-3K, reveal a notable discrepancy between current methods and state-of-the-art (SOTA) standards. This gap emphasizes the necessity for a more sophisticated and precise evaluation metric. In response, our objective is to develop a model that rates images on metrics such as perceptual quality, authenticity, and the correspondence between text and image, in a way that more closely aligns with human perception. In our paper, we introduce a range of effective methods, including prompt designs and the Metric Transformer. The Metric Transformer is a novel structure inspired by the complex interrelationships among various AGI quality metrics. The code is available at https://github.com/huskydoge/CS3324-Digital-Image-Processing/tree/main/Assignment1

dataset, metric transformer, transformer, (13 more...)

arXiv.org Artificial Intelligence

2403.19101

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > Vietnam > Quảng Ninh Province > Hạ Long (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.67)

Industry: Media > Photography (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.88)

From Lengthy to Lucid: A Systematic Literature Review on NLP Techniques for Taming Long Sentences

Passali, Tatiana, Chatzikyriakidis, Efstathios, Andreadis, Stelios, Stavropoulos, Thanos G., Matonaki, Anastasia, Fachantidis, Anestis, Tsoumakas, Grigorios

arXiv.org Artificial Intelligence, Dec-8-2023

Long sentences have been a persistent issue in written communication for many years since they make it challenging for readers to grasp the main points or follow the initial intention of the writer. This survey, conducted using the PRISMA guidelines, systematically reviews two main strategies for addressing the issue of long sentences: a) sentence compression and b) sentence splitting. An increased trend of interest in this area has been observed since 2005, with significant growth after 2017. Current research is dominated by supervised approaches for both sentence compression and splitting. Yet, there is a considerable gap in weakly and self-supervised techniques, suggesting an opportunity for further research, especially in domains with limited data. In this survey, we categorize and group the most representative methods into a comprehensive taxonomy. We also conduct a comparative evaluation analysis of these methods on common sentence compression and splitting datasets. Finally, we discuss the challenges and limitations of current methods, providing valuable insights for future research directions. This survey is meant to serve as a comprehensive resource for addressing the complexities of long sentences. We aim to enable researchers to make further advancements in the field until long sentences are no longer a barrier to effective communication.

compression, computational linguistic, sentence compression, (11 more...)

arXiv.org Artificial Intelligence

2312.05172

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
Oceania > Australia > New South Wales > Sydney (0.14)
North America > United States > Washington > King County > Seattle (0.14)
(46 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(5 more...)

Robust Approximation Algorithms for Non-monotone $k$-Submodular Maximization under a Knapsack Constraint

Ha, Dung T. K., Pham, Canh V., Tran, Tan D., Hoang, Huan X.

arXiv.org Artificial Intelligence, Sep-21-2023

The problem of non-monotone $k$-submodular maximization under a knapsack constraint ($\mathsf{kSMK}$) over a ground set of size $n$ arises in many machine learning applications, such as data summarization and information propagation. However, existing algorithms for the problem face two open questions: how to handle the non-monotone case, and how to return a good solution quickly when the data is large. This paper introduces two deterministic approximation algorithms for the problem that competitively improve the query complexity of existing algorithms. Our first algorithm, $\mathsf{LAA}$, returns an approximation ratio of $1/19$ within $O(nk)$ query complexity. The second one, $\mathsf{RLA}$, improves the approximation ratio to $1/5-\epsilon$ in $O(nk)$ queries, where $\epsilon$ is an input parameter. Our algorithms are the first ones that provide constant approximation ratios within only $O(nk)$ query complexity for the non-monotone objective. They therefore need fewer queries than state-of-the-art ones by a factor of $\Omega(\log n)$. Besides the theoretical analysis, we have evaluated our proposed algorithms experimentally on two instances of the problem: Influence Maximization and Sensor Placement. The results confirm that our algorithms match the theoretical quality of the cutting-edge techniques while significantly reducing the number of queries.

algorithm, constraint, query complexity, (13 more...)

arXiv.org Artificial Intelligence

2309.12025

Country:

Asia > Vietnam > Hanoi > Hanoi (0.14)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.77)

Enhancing Machine Learning Performance with Continuous In-Session Ground Truth Scores: Pilot Study on Objective Skeletal Muscle Pain Intensity Prediction

Faremi, Boluwatife E., Stavres, Jonathon, Oliveira, Nuno, Zhou, Zhaoxian, Sung, Andrew H.

arXiv.org Artificial Intelligence, Aug-1-2023

Machine learning (ML) models trained on subjective self-report scores struggle to classify pain objectively and accurately because of the significant variance between real-time pain experiences and scores recorded afterwards. This study developed two devices for acquisition of real-time, continuous in-session pain scores and gathering of ANS-modulated electrodermal activity (EDA). The experiment recruited N = 24 subjects who underwent a post-exercise circulatory occlusion (PECO) with stretch, inducing discomfort. Subject data were stored in a custom pain platform, facilitating extraction of time-domain EDA features and in-session ground truth scores. Moreover, post-experiment visual analog scale (VAS) scores were collected from each subject. Machine learning models, namely Multi-layer Perceptron (MLP) and Random Forest (RF), were trained using corresponding objective EDA features combined with in-session scores and post-session scores, respectively. Over a 10-fold cross-validation, the macro-averaged geometric mean score revealed that MLP and RF models trained with objective EDA features and in-session scores achieved superior performance (75.9% and 78.3%, respectively) compared to models trained with post-session scores (70.3% and 74.6%, respectively). This pioneering study demonstrates that using continuous in-session ground truth scores significantly enhances ML performance in pain intensity characterization, overcoming ground truth sparsity-related issues, data imbalance, and high variance. This study informs future objective-based ML pain system training.
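
The abstract reports a macro-averaged geometric mean score without defining it. The following is a minimal sketch, assuming one common definition: for each class, take the square root of sensitivity times specificity (computed one-vs-rest), then average over classes. The function name `macro_gmean` and the toy labels are hypothetical and are not taken from the paper.

```python
from math import sqrt

def macro_gmean(y_true, y_pred):
    """Mean over classes of sqrt(sensitivity * specificity), one-vs-rest.

    Assumed definition for illustration; the paper may compute it differently.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        # Guard against empty denominators (class absent from y_true, etc.).
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        scores.append(sqrt(sensitivity * specificity))
    return sum(scores) / len(scores)

# Toy example with two pain-intensity classes (labels are made up).
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
print(round(macro_gmean(y_true, y_pred), 4))  # prints 0.7071
```

Because each class's score multiplies sensitivity by specificity, a model that ignores a minority class is penalized heavily, which is why this kind of score is often preferred over plain accuracy on imbalanced data.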

artificial intelligence, eda, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2308.00886

Country:

North America > United States > Mississippi > Forrest County > Hattiesburg (0.14)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Vietnam > Quảng Ninh Province (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.69)

VariTex: Variational Neural Face Textures

Bühler, Marcel C., Meka, Abhimitra, Li, Gengyan, Beeler, Thabo, Hilliges, Otmar

arXiv.org Artificial Intelligence, Apr-13-2021

Deep generative models have recently demonstrated the ability to synthesize photorealistic images of human faces with novel identities. A key challenge to the wide applicability of such techniques is to provide independent control over semantically meaningful parameters: appearance, head pose, face shape, and facial expressions. In this paper, we propose VariTex, to the best of our knowledge the first method that learns a variational latent feature space of neural face textures, which allows sampling of novel identities. We combine this generative model with a parametric face model and gain explicit control over head pose and facial expressions. To generate images of complete human heads, we propose an additive decoder that generates plausible additional details such as hair. A novel training scheme enforces a pose-independent latent space and, in consequence, allows learning of a one-to-many mapping between latent codes and pose-conditioned exterior regions. The resulting method can generate geometrically consistent images of novel identities, allowing fine-grained control over head pose, face shape, and facial expressions, and facilitating a broad range of downstream tasks such as sampling novel identities, re-posing, and expression transfer.

expression, neural texture, texture, (17 more...)

arXiv.org Artificial Intelligence

2104.05988

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > Virginia (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

6 Privacy Solutions for Big Data and Machine Learning

#artificialintelligence, Nov-2-2020, 22:35:43 GMT

Travelers who wander the banana pancake trail through Southeast Asia will all get roughly the same experience. They'll eat crummy food on one of fifty boats floating around Ha Long Bay, then head up to the highlands of Sa Pa for a faux cultural experience with hill tribes that grow dreadful cannabis. After that, it's on to Laos to float the river in Vang Vieng while smashed on opium tea. Eventually, you'll see someone wearing a t-shirt with the classic slogan – "same same, but different." The origins of this phrase surround the Southeast Asian vendors who often respond to queries about the authenticity of fake goods they're selling with "same same, but different." It's a phrase that appropriately describes how the technology world loves to spin things as fresh and new when they've hardly changed at all.

data mining, machine learning, platform, (14 more...)

#artificialintelligence

Country:

Asia > Vietnam > Quảng Ninh Province > Hạ Long (0.25)
Asia > Southeast Asia (0.25)
Asia > Laos (0.25)
(2 more...)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.88)
Information Technology > Data Science > Data Mining > Big Data (0.41)
